mysql delete语句使用where in不走索引的问题解决


本文总阅读量

mysql delete语句使用where in不走索引的问题解决
1、背景介绍:
想要删除某个表中重复行的数据
先使用聚合函数查询到哪些数据是重复的:
select update_time from test_data group by update_time having count(1) > 1;

2、根据update_time数据相同的值找到对应的id,根据主键id进行删除:
delete from test_data where id in (select id from (select min(id) as id,update_time from test_data group by update_time having count(1) > 1 ) as t);

然后查看这个sql的执行计划
explain delete from test_data where id in (select id from (select min(id) as id,update_time from test_data group by update_time having count(1) > 1 ) as t);

1
2
3
4
5
6
7
8
9
root@db 12:16:  [tmp_opa_user]> explain delete from test_data where id in (select id from (select min(id) as id,update_time from test_data  group by update_time having count(1) > 1 ) as t);
+------+--------------------+------------+----------------+---------------+-----------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+------------+----------------+---------------+-----------------+---------+------+---------+-------------+
| 1 | PRIMARY | test_data | ALL | NULL | NULL | NULL | NULL | 1085134 | Using where |
| 2 | DEPENDENT SUBQUERY | <derived3> | index_subquery | key0 | key0 | 9 | func | 108513 | |
| 3 | DERIVED | test_data | index | NULL | idx_update_time | 7 | NULL | 1085134 | Using index |
+------+--------------------+------------+----------------+---------------+-----------------+---------+------+---------+-------------+
3 rows in set (0.001 sec)

这种方法没有走主键索引,删除耗时:

1
2
root@db 12:27:  [tmp_opa_user]>  delete from test_data where id in (select id from (select min(id) as id,update_time from test_data  group by update_time having count(1) > 1 ) as t);
Query OK, 81241 rows affected (4.669 sec)

3、更改delete删除方法,使用delete inner join的方法:
sql如下:
explain delete a from test_data a inner join (select id from (select min(id) as id,update_time from test_data group by update_time having count(1) > 1 ) as t) d on d.id = a.id;

1
2
3
4
5
6
7
8
9
10
explain delete a from test_data a inner join  (select id from (select min(id) as id,update_time from test_data  group by update_time having count(1) > 1 ) as t)  d on d.id = a.id;
+------+-------------+------------+--------+---------------+-----------------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+--------+---------------+-----------------+---------+------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1003893 | Using where |
| 1 | PRIMARY | a | eq_ref | PRIMARY | PRIMARY | 8 | d.id | 1 | |
| 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 1003893 | |
| 3 | DERIVED | test_data | index | NULL | idx_update_time | 7 | NULL | 1003893 | Using index |
+------+-------------+------------+--------+---------------+-----------------+---------+------+---------+-------------+
4 rows in set (0.000 sec)

执行计划提示1执行删除的时候使用了索引:

1
2
root@db 12:22:  [tmp_opa_user]> delete a from test_data a inner join  (select id from (select min(id) as id,update_time from test_data  group by update_time having count(1) > 1 ) as t)  d on d.id = a.id; 
Query OK, 81241 rows affected (4.120 sec)

删除8万条数据执行了4.1s

4、以上两种方法,3的方法会比2的方法快,尤其是在根据二级索引删除的时候,方法3比方法2提升N个档次

1
delete from test_data where update_time in (select update_time from (select update_time from test_data group by update_time having count(1) > 1 limit 1000) as t);
1
delete a from test_data a inner join (select update_time from (select update_time from test_data group by update_time having count(1) > 1) as t) d on d.update_time = a.update_time;

感兴趣的可以用这两个sql测试一下性能
5、扩展
如果感觉一次性删除数据量太大,可以使用limit来限制条数,例如:
delete a from test_data a inner join (select id from (select min(id) as id,update_time from test_data group by update_time having count(1) > 1 ) as t limit 1000 ) d on d.update_time = a.update_time;

目录

Proudly powered by Hexo and Theme by Lap
本站访客数人次
© 2020 zeven0707's blog